# English Speech Processing

Wav2vec2 Base Librispeech Demo Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the LibriSpeech dataset, achieving a word error rate (WER) of 0.3174 on the evaluation set.

Speech Recognition
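Many entries in this listing quote a word error rate (WER). As a rough illustration of what that figure measures, here is a minimal sketch that computes WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` is hypothetical, and the whitespace tokenization is an assumption: individual model cards may normalize casing or punctuation differently before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of six reference words.
score = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
```

Under this definition a WER of 1.0 means the number of word errors equals the number of reference words, and the value can exceed 1.0 when the hypothesis contains many inserted words.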

Distil Large V3.5 ONNX

Distil-Whisper is a knowledge-distilled version of OpenAI Whisper-Large-v3, designed to be smaller and faster than the original model while keeping transcription accuracy close to it.

Speech Recognition

Transformers English

Ichigo Llama3.1 S Instruct V0.3 Phase 3

Ichigo-llama3s is a large language model series that supports both audio and text input, focusing on enhancing speech understanding capabilities and user interaction experience.

Text-to-Audio English

WhisperNER is a model that performs speech transcription and named entity recognition (NER) simultaneously, with support for open-type entities.

Speech Recognition Supports Multiple Languages

Wav2vec2 Large Lv60 Phoneme Timit English Timit 4k 002

An English phoneme recognition model fine-tuned from facebook/wav2vec2-large-lv60 on the TIMIT dataset, achieving a phoneme error rate of 10.53%.

Speech Recognition

Transformers English

Gazelle v0.2 is a joint speech-language model released by Tincans, supporting English.

Transformers English

Wav2vec2 Large Xlsr 53 English Finetuned Ravdess

A speech emotion recognition model fine-tuned from wav2vec2-large-xlsr-53-english on the RAVDESS dataset.

Audio Classification

Wav2vec2 Lg Xlsr En Speech Emotion Recognition Finetuned Ravdess V8

An English speech emotion recognition model based on the wav2vec2 architecture, fine-tuned on the RAVDESS dataset.

Audio Classification

Wav2vec2 Base Speech Emotion Recognition

A speech emotion recognition model fine-tuned from facebook/wav2vec2-base to predict the speaker's emotion in audio samples.

Audio Classification

Transformers English

Wav2vec2 Large 960h Intent Classification Ori

An intent classification model fine-tuned from facebook/wav2vec2-large-960h, achieving 77.08% accuracy on the evaluation set.

Audio Classification

MuhammadIqbalBazmi

Wav2vec2 Large Tedlium

A Wav2Vec2 large speech recognition model fine-tuned on the TEDLIUM corpus, supporting English speech-to-text conversion.

Speech Recognition English

Wav2vec2 Base Timit Demo Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, reported to have a low word error rate (WER).

Speech Recognition

Wav2vec2 Base Timit Demo Google Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, specializing in English speech-to-text tasks.

Speech Recognition

A speech recognition model fine-tuned from facebook/wav2vec2-base-960h.

Speech Recognition

A speech recognition model fine-tuned from facebook/wav2vec2-base-960h, with a reported word error rate (WER) of 1.0 (100%) on the evaluation set.

Speech Recognition

Wav2vec2 Base Dataset Asr Demo Colab

A speech recognition model fine-tuned from distilhubert on the superb dataset, primarily used for automatic speech recognition (ASR) tasks.

Speech Recognition

Wav2vec2 Base Timit Demo Google Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, achieving a word error rate (WER) of 0.3384 on the evaluation set.

Speech Recognition

Assignment1 Francesco

An English automatic speech recognition (ASR) model built on the Speech-to-Text Transformer (S2T) architecture.

Speech Recognition

Transformers English

Classroom-workshop

A speech recognition model fine-tuned from facebook/wav2vec2-base, supporting automatic speech-to-text tasks.

Speech Recognition

An English speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the librispeech_asr dataset.

Speech Recognition

Wav2vec2 Base Timit Demo Google Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, focusing on English speech-to-text tasks.

Speech Recognition

Wav2vec2 Base Timit Google Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, achieving a word error rate (WER) of 0.3355 on the evaluation set.

Speech Recognition

A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate (WER) of 0.52 on the evaluation set.

Speech Recognition

A speech recognition model fine-tuned from facebook/wav2vec2-base-960h, with a reported word error rate (WER) of 1.0 (100%) on the evaluation set.

Speech Recognition

Wav2vec2 Base Timit Demo Google Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, suitable for English speech-to-text tasks.

Speech Recognition

Wav2vec2 Base Timit Demo Google Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, primarily used for English speech-to-text tasks.

Speech Recognition

patrickvonplaten

Wav2vec2 Base Timit Demo Colab92

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset.

Speech Recognition

Wav2vec2 Base Timit Demo Colab90

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, specializing in English speech-to-text tasks.

Speech Recognition

Wav2vec2 Base Timit Demo Colab11

A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate (WER) of 0.4348 on the TIMIT dataset.

Speech Recognition

Wav2vec2 Base Timit Demo Colab 1

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, with a word error rate (WER) of 0.4398.

Speech Recognition

Wav2vec2 Base Timit Demo Colab2

A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate (WER) of 0.5664 on the evaluation set.

Speech Recognition

Wav2vec2 Base Timit Ali Hasan Colab EX2

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, with a word error rate (WER) of 0.4458 on the evaluation set.

Speech Recognition

Wav2vec2 Base Timit Ali Hasan Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset.

Speech Recognition

Wav2vec2 Base Timit Moaiz Exp2

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset.

Speech Recognition

Wav2vec2 Base Timit Demo Colab

A speech recognition model fine-tuned from the wav2vec2-base model on the TIMIT dataset.

Speech Recognition

Wav2vec2 Base Timit Demo Colab

A fine-tuned speech recognition model based on facebook/wav2vec2-base, trained and evaluated on the TIMIT dataset.

Speech Recognition

Wav2vec2 Base Timit Demo Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, intended for demonstration purposes.

Speech Recognition

Wav2vec2 Base 960h Timit Demo Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base-960h, achieving a word error rate (WER) of 21.6% on the TIMIT dataset.

Speech Recognition
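The entries in this listing report error rates in two notations: fractions such as 0.3174 and percentages such as 21.6%. The following is a minimal sketch, assuming each figure is available as a plain string, that converts both notations to a common fraction so entries can be sorted. The helper `to_fraction` and the labels used as keys in `reported` are hypothetical names chosen for illustration. The figures come from different evaluation sets, so the resulting order is only indicative.

```python
def to_fraction(value: str) -> float:
    """Convert a metric string such as '21.6%' or '0.3468' to a fraction."""
    text = value.strip()
    if text.endswith("%"):
        # Percent notation: drop the sign and rescale to a fraction.
        return float(text[:-1]) / 100.0
    return float(text)


# Figures copied from entries in this listing; the keys are illustrative labels.
reported = {
    "wav2vec2-base, LibriSpeech": "0.3174",
    "wav2vec2-base-960h, TIMIT": "21.6%",
    "wav2vec2-base, TIMIT": "0.3468",
}

# Sort labels from lowest to highest error rate on the common scale.
ranked = sorted(reported, key=lambda name: to_fraction(reported[name]))
```

After normalization, 21.6% becomes 0.216 and sorts ahead of 0.3174 and 0.3468.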

Wav2vec2 Base Timit Demo Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, with a word error rate (WER) of 0.3468.

Speech Recognition

Wav2vec2 Child En Tokenizer 4

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m, specifically designed for English child speech recognition tasks.

Speech Recognition


© 2025 AIbase